1. INTRODUCTION

Bambara groundnut, scientific name Vigna subterranea (L.) Verdc., was originally grown on the
African continent and has been cultivated in tropical Africa for centuries [1]. It has been planted in Malaysia
because of the similar weather conditions, but one of the challenges of planting it here is that it is easily
infected with leaf diseases after heavy rains [1]. Automatic leaf disease recognition is highly important in
order to minimize the damage that leaf diseases cause during the growth, harvest and post-harvest
processing of bambara groundnut, as well as to maximize productivity and ensure agricultural
sustainability [2]. The existing method for plant leaf disease recognition is simply naked-eye observation
by experts [3]. This requires a large team of experts as well as continuous monitoring of the plants, which
incurs costs for large farms [3]. Visual recognition of plant diseases is also laborious, time consuming,
less accurate and feasible only in limited areas [4].

In order to adapt to this fast changing environment, appropriate and timely plant leaf disease
recognition is crucial. Most plant leaf diseases generate some kind of manifestation in the visible
spectrum, so naked-eye examination by a trained professional is the prime technique adopted in practice
for plant disease recognition [5]. An automated plant leaf disease recognition system could be of great help
to amateurs in the gardening process and also to trained professionals as a verification system in disease
diagnostics [6]. Various features and classifiers have been investigated to recognize plant diseases
automatically [7]-[10]. Colour features and a Back-Propagation Neural Network (BPNN) have been used for
cotton and groundnut disease classification [7]. Shape and colour features with a Support Vector Machine
(SVM) classifier have been utilized to classify rice-plant diseases [8]. SVM has also been used to classify
cotton leaf spot disease in [9]. A comparative study has been performed among texture features, namely
Local Binary Pattern (LBP) and Gray Level Co-occurrence Matrix (GLCM), and classifiers, namely
Probabilistic Neural Network (PNN), BPNN, SVM and Random Forest (RF), to classify diseases in grapes,
and the results indicate that GLCM with RF achieves the best recognition results [11]. Convolutional Neural
Network (CNN) is becoming popular in object recognition problems such as leaf recognition [13-14], fruit
recognition [15-16], character recognition [17], vehicle recognition [18] and palm oil fresh fruit bunch
ripeness grading [19]. Plant disease classification based on CNN produces outstanding accuracy
[20]. Bag of Features (BoF), one of many machine learning techniques, has also shown good performance in
object recognition [21-22]. Due to the promising results produced by BoF and CNN, this research investigates
their performance in recognizing bambara groundnut leaf disease.


2. RESEARCH METHOD 
2.1. Convolutional Neural Network (CNN) 

The architecture of a CNN is structured as a series of layers of three types: the convolve layer, the
pooling layer and the Rectified Linear Unit (ReLU) layer [16]. The convolve layer extracts features of an
image using a filter that strides over the input image one patch at a time. The ReLU layer replaces all
negative values in the feature map with zero, while the pooling layer down-samples the feature map after
the ReLU layer to reduce its dimensionality. Max-pooling computes the maximum of each local region of the
feature map. Neighboring pooling units take input from regions of the feature map that are shifted, or
strided, by more than one row or column. Figure 1 shows the architecture of a CNN [23].
Figure 1. The architecture of CNN [26]
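
The three layer operations described above can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the implementation used in this work: the function names conv2d, relu and max_pool, the single-channel square input, the absence of padding and the 2 x 2 pooling window are all assumptions made for the example.

```python
def conv2d(image, kernel, stride=1):
    """Slide a square filter over a square single-channel image (no padding)."""
    k = len(kernel)
    out_size = (len(image) - k) // stride + 1
    out = []
    for i in range(out_size):
        row = []
        for j in range(out_size):
            total = 0.0
            # Multiply the filter with the image patch under it and sum up.
            for di in range(k):
                for dj in range(k):
                    total += image[i * stride + di][j * stride + dj] * kernel[di][dj]
            row.append(total)
        out.append(row)
    return out


def relu(feature_map):
    """Replace every negative value in the feature map with zero."""
    return [[max(0.0, v) for v in row] for row in feature_map]


def max_pool(feature_map, size=2, stride=2):
    """Keep the maximum of each local window to down-sample the feature map."""
    out_size = (len(feature_map) - size) // stride + 1
    return [
        [
            max(
                feature_map[i * stride + di][j * stride + dj]
                for di in range(size)
                for dj in range(size)
            )
            for j in range(out_size)
        ]
        for i in range(out_size)
    ]


# Toy 5 x 5 "image" and 2 x 2 filter (made-up values, for illustration only).
image = [
    [1, 2, 0, 1, 3],
    [0, 1, 2, 3, 1],
    [1, 0, 1, 2, 2],
    [2, 1, 0, 1, 0],
    [1, 3, 2, 0, 1],
]
kernel = [[1, -1], [-1, 1]]

pooled = max_pool(relu(conv2d(image, kernel)))
print(pooled)  # [[3.0, 2.0], [3.0, 2.0]]
```

The 5 x 5 input becomes a 4 x 4 feature map after the convolve step and a 2 x 2 map after pooling, which illustrates the reduction in dimensionality mentioned in the text.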


2.2. Bag of Features (BoF) 

One method that represents images as orderless collections of local features is called Bag of
Features (BoF) [22]. In this project, Speeded Up Robust Features (SURF) has been used in BoF because this
feature performs well and requires only low computational cost [24]. SURF combines an interest point
detector based on the Hessian matrix measure with a descriptor. The descriptor uses only 64 dimensions,
which leads to quick feature extraction, and it is computed from 2D Haar wavelet responses [24]. There are
two common perspectives for explaining the BoF image representation, the first of which is by analogy
with the Bag of Words representation. With Bag of Words, a document is represented as a normalized
histogram of word counts [25]. Figure 2 shows the process for BoF image representation.
Figure 2. Process for BoF image representation [22]
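
The Bag of Words analogy can be made concrete with a short sketch: each local descriptor is assigned to its nearest visual word, and the image is then described by the normalized histogram of word counts. The code below is a hedged illustration only. The function names, the 2-D toy descriptors (real SURF descriptors have 64 dimensions) and the hand-picked vocabulary are assumptions; in practice the vocabulary is usually obtained by clustering descriptors from the training images, a step not shown here.

```python
def nearest_word(descriptor, vocabulary):
    """Return the index of the visual word closest to the descriptor."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(
        range(len(vocabulary)),
        key=lambda k: squared_distance(descriptor, vocabulary[k]),
    )


def bof_histogram(descriptors, vocabulary):
    """Normalized histogram of visual word occurrences for one image."""
    counts = [0] * len(vocabulary)
    for descriptor in descriptors:
        counts[nearest_word(descriptor, vocabulary)] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * len(vocabulary)
    return [c / total for c in counts]


# Hypothetical vocabulary of three visual words and four toy descriptors.
vocabulary = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
descriptors = [[0.1, 0.0], [0.9, 1.0], [0.2, 0.1], [0.1, 0.9]]

print(bof_histogram(descriptors, vocabulary))  # [0.5, 0.25, 0.25]
```

The order of the descriptors does not change the histogram, which is what "orderless collection of local features" means in practice.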

3. RESULTS AND ANALYSIS


3.1. The Dataset 

A new dataset of bambara groundnut leaf images has been constructed that consists of 200
images of non-infected leaves and 200 images of leaves with diseases. They were captured at a farm in
Semenyih, Selangor using a mobile phone. Some sample images of bambara groundnut without and with
diseases are illustrated in Figure 3 and Figure 4, respectively.
Figure 3. Some sample images of bambara groundnut without leaf diseases
Figure 4. Some sample images of bambara groundnut with leaf diseases

3.2. CNN

A stack of CNN layers consists of a convolve layer, a pooling layer and a ReLU layer, and additional
stacks of layers can be added to improve the performance. The CNN takes color images as input and the
features are automatically extracted by the convolve layers. The size of the filters in the convolve layer
and the value of the stride in the pooling layer, which is the number of columns skipped as the sliding
window moves through the image, can be changed as they affect the recognition performance. Besides that,
the number of epochs, which sets how many times the training process iterates over the training set, and
the initial learning rate, which controls how much the weights are adjusted during the training process,
can be changed to observe their effect on the recognition rate. The image size required for the basic CNN
is 224 x 224 pixels. Experiments were conducted on combinations of these values and the results are shown
in Table 1. The Convolve Layer column gives the size of the filter and the number of filters in each
convolve layer. Referring to Table 1, 100% accuracy is achieved with [5,20] in the first convolve layer and
[3,32] in the second convolve layer, and Figure 5 shows the results of the training and validation
processes for this configuration.
Figure 5. The result of CNN with [5,20] in the first convolve layer and [3,32] in the second convolve layer
Table 1. Experimental Results on Parameter Tuning for Basic CNN

No. of Stacks of Layers   Convolve Layer         Pooling Layer Stride   Accuracy (%)   Total Time (s)
1                         [3,16]                 |                      78.82          30
1                         [5,20]                 3                      83.59          28
2                         [3,16][3,32]           |                      1.79           27
2                         [5,20][3,32]           2                      100.00         21
3                         [5,20][3,32][3,32]     2                      75.90          41
3                         [5,20][3,32][3,16]     2                      74.87          34


Table 1 shows that the accuracy increases as the number of layers increases, but begins to drop
when the number of layers is more than 3. This means that two stacks of layers plus 1 classification
layer produce the best accuracy for bambara groundnut leaf disease recognition.
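
To see how the filter size and the pooling stride shape the network, the spatial size of the feature map after each stack can be worked out with the standard formula out = floor((in + 2p - w) / s) + 1, where w is the window size, s the stride and p the padding. The sketch below applies it to the best configuration in Table 1. The convolution stride of 1, the zero padding and the 2 x 2 pooling window are assumptions, since the text specifies only the filter sizes and the pooling stride; the function names are likewise illustrative.

```python
def output_size(input_size, window, stride=1, padding=0):
    """Spatial size after sliding a window over the input (conv or pooling)."""
    return (input_size + 2 * padding - window) // stride + 1


def trace_sizes(input_size, stacks):
    """Feature-map size after each (convolve, pool) stack.

    Each stack is (filter_size, pool_window, pool_stride). ReLU does not
    change the spatial size, so it is omitted from the trace.
    """
    sizes = []
    size = input_size
    for filter_size, pool_window, pool_stride in stacks:
        size = output_size(size, filter_size)                      # convolve layer
        size = output_size(size, pool_window, stride=pool_stride)  # pooling layer
        sizes.append(size)
    return sizes


# 224 x 224 input, 5 x 5 filters then 3 x 3 filters, pooling stride 2.
# The 2 x 2 pooling window is an assumption, not a value from the text.
print(trace_sizes(224, [(5, 2, 2), (3, 2, 2)]))  # [110, 54]
```

Under these assumptions the 224 x 224 image shrinks to 110 x 110 after the first stack and 54 x 54 after the second, so a larger stride or an extra stack quickly reduces the spatial detail that reaches the classification layer.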


3.3. Bag of Features (BoF) 

The size of an image used for BoF is 227 x 227 pixels and the accuracy produced is 100%. Figure 6
shows the visual word occurrences produced by BoF for our dataset. Speeded-Up Robust Features
(SURF) and a Support Vector Machine (SVM) are used in the BoF.
Figure 6. Visual word occurrences result
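
Once every image is reduced to a histogram of visual word occurrences, the SVM only has to separate two sets of histograms. The sketch below trains a linear SVM by full-batch subgradient descent on the regularized hinge loss. It is a minimal illustration under stated assumptions, not the classifier used in this work: the text does not give the kernel, solver or hyperparameters, and the two-bin histograms, the labels (+1 for diseased, -1 for non-infected) and the function names are invented for the example.

```python
def train_linear_svm(samples, labels, epochs=200, lr=0.1, reg=0.01):
    """Minimise reg/2 * |w|^2 + mean hinge loss with subgradient descent."""
    n = len(samples)
    n_features = len(samples[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [reg * wi for wi in w]  # gradient of the regularization term
        grad_b = 0.0
        for x, y in zip(samples, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1:  # sample is inside the margin or misclassified
                for i in range(n_features):
                    grad_w[i] -= y * x[i] / n
                grad_b -= y / n
        w = [wi - lr * gi for wi, gi in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w, b, x):
    """Classify a histogram as +1 or -1 from the sign of the decision value."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1


# Toy two-word histograms (made-up values): +1 = diseased, -1 = non-infected.
histograms = [[0.8, 0.2], [0.7, 0.3], [0.2, 0.8], [0.3, 0.7]]
labels = [1, 1, -1, -1]

w, b = train_linear_svm(histograms, labels)
predictions = [predict(w, b, h) for h in histograms]
print(predictions)
```

A linear decision boundary is enough here because the toy classes differ in which visual word dominates; real histograms have many more bins, and a library SVM would normally replace this hand-written training loop.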


Table 2 gives an overview of the accuracy of CNN compared to BoF on our bambara groundnut leaf
dataset. Table 2 shows that BoF achieves the same validation accuracy as the basic CNN but takes a longer
time to achieve this result. This is because extracting the SURF features takes longer than extracting
the low-level and middle-level features with the CNN.


Table 2. The Performance Overview for Basic CNN and BoF for Bambara Dataset

Model                     Basic CNN   BoF
Validation accuracy (%)   100         100
Elapsed time (s)          21          3]
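
The two quantities compared in Table 2 can be computed as in the following sketch. It is an assumption-laden illustration: the text does not state how elapsed time was measured, so wall-clock timing with time.perf_counter is used here, and the helper names and toy labels are hypothetical.

```python
import time


def accuracy(predicted, actual):
    """Percentage of validation samples whose predicted label is correct."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * correct / len(actual)


def timed(fn, *args):
    """Run fn(*args) and return its result with the elapsed wall-clock seconds."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


# Toy labels: 1 = diseased, 0 = non-infected (made-up values).
actual = [1, 0, 0, 1]
predicted = [1, 0, 1, 1]

score, seconds = timed(accuracy, predicted, actual)
print(score)  # 75.0
```

Timing the whole pipeline, including feature extraction, is what makes the comparison meaningful, since the text attributes the longer BoF time to the extraction of SURF features.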


4. CONCLUSION 

In this paper, a comparison between CNN and BoF was performed with respect to accuracy and
elapsed time. The experimental results show that BoF achieved the same accuracy as CNN, which is 100%.
However, BoF requires a longer elapsed time due to the large number of SURF features that must be
extracted. Although the number of layers affects the accuracy, a more complex CNN architecture does not
guarantee a better result. The experimental results in this research indicate that two stacks of layers
produce better accuracy than three stacks of layers. Since CNN reaches the same accuracy in less time, the
use of CNN is recommended for leaf disease recognition when processing time is an issue. For future work,
more deep learning models and publicly available datasets will be tested.